Aesthetics for points

ggplot2 packages and dataset

# Set a reachable CRAN mirror, then install the plotting and dataset packages
options(repos = c(CRAN = "https://cloud.r-project.org"))
install.packages("ggplot2")
install.packages("palmerpenguins")
library("ggplot2")
library("palmerpenguins")

ggplot visualization: the plus sign (+) is used to add layers to the plot; geom_point creates points to represent the data; and mapping = aes maps variables to the x and y axes and controls how the data will look.

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g))
## Warning: Removed 2 rows containing missing values (`geom_point()`).
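
The warning above reports two removed rows. As a minimal sketch, assuming the same penguins data, the rows with missing values can be counted directly, and the warning can be silenced with the na.rm argument of geom_point. The names missing_rows and p_clean are illustrative only.

```r
library(ggplot2)
library(palmerpenguins)

# Count rows where either plotted variable is missing
missing_rows <- sum(is.na(penguins$flipper_length_mm) | is.na(penguins$body_mass_g))
missing_rows

# na.rm = TRUE drops the incomplete rows silently instead of warning
p_clean <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g), na.rm = TRUE)
p_clean
```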

Adding “color” to the aesthetics (aes) to identify “species” in the Palmer Penguins data

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, color=species))
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Adding “shape” to the aesthetics (aes) to identify “species” in the Palmer Penguins data

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, shape=species))
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Adding “color” and “shape” to the aesthetics (aes) to identify “species” in the Palmer Penguins data

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, shape=species, color=species))
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Adding “size”, “color” and “shape” to the aesthetics (aes) to identify “species” in the Palmer Penguins data

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, shape=species, color=species, size=species))
## Warning: Using size for a discrete variable is not advised.
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Using alpha with “species” to control transparency of data

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, alpha=species))
## Warning: Using alpha for a discrete variable is not advised.
## Warning: Removed 2 rows containing missing values (`geom_point()`).
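
The warnings above say that size and alpha are not advised for a discrete variable such as species. One possible alternative, sketched here under the assumption that transparency should reflect a numeric quantity, is to map alpha to a continuous variable, or to set it to a single fixed value outside aes. The value 0.4 and the name p_alpha are arbitrary choices for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# alpha mapped to a continuous variable: heavier penguins are drawn more opaque
ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, alpha = body_mass_g),
             na.rm = TRUE)

# alpha set outside aes(): every point gets the same transparency
p_alpha <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g),
             alpha = 0.4, na.rm = TRUE)
p_alpha
```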

Changing the color of all data points. Placing color outside aes sets one fixed color instead of mapping a variable to color.

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g), color="purple")
## Warning: Removed 2 rows containing missing values (`geom_point()`).
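
The placement of color matters. The sketch below contrasts the two placements: outside aes the string is used as the actual color, while inside aes it is treated as a data value and ggplot2 picks a color from its default palette. The names p_fixed and p_mapped are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# Outside aes(): "purple" is a fixed color applied to every point
p_fixed <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g),
             color = "purple", na.rm = TRUE)

# Inside aes(): "purple" is treated as a one-level variable, so the points
# get a default palette color and a legend entry labelled "purple"
p_mapped <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = "purple"),
             na.rm = TRUE)

# layer_data() returns the values ggplot2 computed for a layer
unique(layer_data(p_fixed)$colour)
unique(layer_data(p_mapped)$colour)
```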

Geom functions: point, bar, line, etc.

geom_smooth

ggplot(data=penguins) + geom_smooth(mapping = aes(x=flipper_length_mm, y=body_mass_g))
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
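
The message above shows that geom_smooth picked the loess method on its own. As a sketch, the method can also be stated explicitly; here a straight-line fit (method = "lm") is assumed purely for illustration, with the confidence band switched off through se = FALSE.

```r
library(ggplot2)
library(palmerpenguins)

# Linear fit instead of the default loess curve; se = FALSE hides the band
p_lm <- ggplot(data = penguins) +
  geom_smooth(mapping = aes(x = flipper_length_mm, y = body_mass_g),
              method = "lm", formula = y ~ x, se = FALSE, na.rm = TRUE)
p_lm
```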

Adding additional geoms

ggplot(data=penguins) + geom_smooth(mapping = aes(x=flipper_length_mm, y=body_mass_g)) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g)) 
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values (`geom_point()`).
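
When several geoms share the same mapping, it can be written once in ggplot() and inherited by every layer. A minimal sketch of the same plot written that way (p_both is an illustrative name):

```r
library(ggplot2)
library(palmerpenguins)

# The mapping is declared once and inherited by both layers
p_both <- ggplot(data = penguins,
                 mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_smooth(na.rm = TRUE) +
  geom_point(na.rm = TRUE)
p_both
```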

Adding line type to identify different species

ggplot(data=penguins) + geom_smooth(mapping = aes(x=flipper_length_mm, y=body_mass_g, linetype=species))
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).

Using the jitter function, which adds a small amount of random noise to each point to make points easier to see when data overlap

ggplot(data=penguins) + geom_jitter(mapping = aes(x=flipper_length_mm, y=body_mass_g))
## Warning: Removed 2 rows containing missing values (`geom_point()`).
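
Because the jitter is random, the plot changes slightly on every run. The sketch below uses position_jitter to control the amount of noise and to fix a seed so the result is reproducible. The width, height and seed values are arbitrary assumptions, not recommendations.

```r
library(ggplot2)
library(palmerpenguins)

# width and height are in data units (mm and g here); seed makes the jitter repeatable
p_jitter <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g),
             position = position_jitter(width = 0.5, height = 50, seed = 1),
             na.rm = TRUE)
p_jitter
```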

Bar chart with the diamonds data. When the “y” axis isn’t specified, geom_bar automatically uses the COUNT of rows in each group (the data frequency)

ggplot(data=diamonds) + 
  geom_bar(mapping = aes(x=cut))
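
To see the counts that geom_bar computes, the same frequencies can be tabulated by hand and drawn with geom_col, which plots the y values it is given. This is a sketch using base R's table; cut_counts is an illustrative name, and Freq is the column name that as.data.frame assigns to the counts.

```r
library(ggplot2)

# Tabulate how many diamonds fall into each cut level
cut_counts <- as.data.frame(table(cut = diamonds$cut))
cut_counts

# geom_col draws bars from precomputed heights, reproducing the geom_bar chart
ggplot(data = cut_counts) +
  geom_col(mapping = aes(x = cut, y = Freq))
```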

Adding color to the cut

ggplot(data=diamonds) + 
  geom_bar(mapping = aes(x=cut, color=cut))

Using fill to add color inside bar for cut

ggplot(data=diamonds) + 
  geom_bar(mapping = aes(x=cut, fill=cut))

Using fill to add color inside the bars by clarity. The result is a stacked bar chart in which each cut bar is divided by clarity, showing the volume of each clarity level within each cut

ggplot(data=diamonds) + 
  geom_bar(mapping = aes(x=cut, fill=clarity))
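
Stacking is only the default arrangement. As a sketch, the position argument of geom_bar can place the clarity bars side by side ("dodge") or rescale each cut bar to the same height to compare proportions ("fill"). The axis label and the name p_fill are illustrative.

```r
library(ggplot2)

# Side-by-side bars: one bar per clarity level within each cut
ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")

# Proportional bars: each cut bar is scaled to a total height of 1
p_fill <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity), position = "fill") +
  labs(y = "proportion")
p_fill
```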

Using facets to display smaller groups or subsets of the data

facet_wrap function, using species

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, color=species)) + facet_wrap(~species)
## Warning: Removed 2 rows containing missing values (`geom_point()`).

facet_wrap function, using cut from the diamonds dataset

ggplot(data=diamonds) + geom_bar(mapping=aes(x=color, fill=cut)) + facet_wrap(~cut)

facet_grid function, used with the penguins dataset. Most useful when exploring the relationship between multiple groups; here, sex and species

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, color=species)) + facet_grid(sex~species)
## Warning: Removed 2 rows containing missing values (`geom_point()`).
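
Some penguins have no recorded sex, so the grid above includes an extra NA row. One way to avoid it, sketched here with base R subsetting, is to drop those rows before plotting. The name penguins_sexed is illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# Keep only the rows where sex is recorded
penguins_sexed <- penguins[!is.na(penguins$sex), ]

ggplot(data = penguins_sexed) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species),
             na.rm = TRUE) +
  facet_grid(sex ~ species)
```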

Using facet_grid to look at different species

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, color=species)) + facet_grid(~species)
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Using facet_grid to split the plot by sex

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, color=species)) + facet_grid(~sex)
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Labels & Annotations

Adding title using labs

ggplot(data = penguins) + 
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(title = "Palmer Penguins: Body Mass vs. Flipper Length")
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Adding subtitle and caption

ggplot(data = penguins) + 
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption="Data collected by Dr. Kristen Gorman")
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Using the annotate function; the location is specified using x- and y-axis values

ggplot(data = penguins) + 
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption="Data collected by Dr. Kristen Gorman") +
  annotate("text", x=220, y=3500, label="The Gentoos are the largest")
## Warning: Removed 2 rows containing missing values (`geom_point()`).
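
annotate is not limited to text. As a sketch, a "rect" annotation can shade a region of the plot using the same axis units. The rectangle corners below were chosen by eye to roughly cover the Gentoo cluster and are assumptions, not values from the data.

```r
library(ggplot2)
library(palmerpenguins)

p_rect <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species),
             na.rm = TRUE) +
  # Shaded box; xmin/xmax/ymin/ymax are in flipper-length (mm) and body-mass (g) units
  annotate("rect", xmin = 205, xmax = 232, ymin = 3900, ymax = 6400,
           alpha = 0.1, fill = "purple") +
  annotate("text", x = 220, y = 3500, label = "The Gentoos are the largest")
p_rect
```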

Adding color, size and angle to annotations

ggplot(data = penguins) + 
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption="Data collected by Dr. Kristen Gorman") +
  annotate("text", x=220, y=3500, label="The Gentoos are the largest", color="purple", fontface="bold", size=4.5, angle=0)
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Saving the plot to a variable. Useful for adding layers later. Example with annotations

p <- ggplot(data = penguins) + 
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption="Data collected by Dr. Kristen Gorman")

p + annotate("text", x=220, y=3500, label="The Gentoos are the largest", color="purple", fontface="bold", size=4.5, angle=0)
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Saving Visualizations

ggplot(data=penguins) + geom_point(mapping = aes(x=flipper_length_mm, y=body_mass_g, shape=species, color=species))
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Using ggsave, the last plot displayed is saved

ggsave("Three Penguin Species.png")
## Saving 7 x 5 in image
## Warning: Removed 2 rows containing missing values (`geom_point()`).
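
ggsave can also be told exactly which plot to save and at what size, rather than relying on the last plot displayed. A minimal sketch, assuming the plot is stored in a variable first; the file name, the temporary directory and the dpi value are illustrative choices.

```r
library(ggplot2)
library(palmerpenguins)

p_species <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g,
                           shape = species, color = species), na.rm = TRUE)

# Write to a temporary folder here; replace with any path you prefer
out_file <- file.path(tempdir(), "three_penguin_species.png")

# plot = names the plot explicitly; width and height match the 7 x 5 in default above
ggsave(filename = out_file, plot = p_species, width = 7, height = 5, units = "in", dpi = 300)
```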